AITopics | output weight

Collaborating Authors

output weight

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

3eec5006051d9544e717067de3220198-Paper-Conference.pdf

Neural Information Processing SystemsFeb-11-2026, 17:26:42 GMT

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country: Asia > China > Fujian Province > Xiamen (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Flat Channels to Infinity in Neural Loss Landscapes

Martinelli, Flavio, Van Meegen, Alexander, Şimşek, Berfin, Gerstner, Wulfram, Brea, Johanni

arXiv.org Artificial IntelligenceNov-13-2025

The loss landscapes of neural networks contain minima and saddle points that may be connected in flat regions or appear in isolation. We identify and characterize a special structure in the loss landscape: channels along which the loss decreases extremely slowly, while the output weights of at least two neurons, $a_i$ and $a_j$, diverge to $\pm$infinity, and their input weight vectors, $\mathbf{w_i}$ and $\mathbf{w_j}$, become equal to each other. At convergence, the two neurons implement a gated linear unit: $a_iσ(\mathbf{w_i} \cdot \mathbf{x}) + a_jσ(\mathbf{w_j} \cdot \mathbf{x}) \rightarrow σ(\mathbf{w} \cdot \mathbf{x}) + (\mathbf{v} \cdot \mathbf{x}) σ'(\mathbf{w} \cdot \mathbf{x})$. Geometrically, these channels to infinity are asymptotically parallel to symmetry-induced lines of critical points. Gradient flow solvers, and related optimization methods like SGD or ADAM, reach the channels with high probability in diverse regression settings, but without careful inspection they look like flat local minima with finite parameter values. Our characterization provides a comprehensive picture of these quasi-flat regions in terms of gradient dynamics, geometry, and functional interpretation. The emergence of gated linear units at the end of the channels highlights a surprising aspect of the computational capabilities of fully connected layers.

artificial intelligence, machine learning, saddle line, (17 more...)

arXiv.org Artificial Intelligence

2506.14951

Country:

Asia (0.28)
Europe (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

3eec5006051d9544e717067de3220198-Paper-Conference.pdf

Neural Information Processing SystemsOct-11-2025, 00:18:24 GMT

dormant neuron, neuron, over-active neuron, (15 more...)

Neural Information Processing Systems

Country: Asia > China > Fujian Province > Xiamen (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry: Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Model-free front-to-end training of a large high performance laser neural network

Skalli, Anas, Sunada, Satoshi, Goldmann, Mirko, Gebski, Marcin, Reitzenstein, Stephan, Lott, James A., Czyszanowski, Tomasz, Brunner, Daniel

arXiv.org Artificial IntelligenceMar-21-2025

Artificial neural networks (ANNs), have become ubiquitous and revolutionized many applications ranging from computer vision to medical diagnoses. However, they offer a fundamentally connectionist and distributed approach to computing, in stark contrast to classical computers that use the von Neumann architecture. This distinction has sparked renewed interest in developing unconventional hardware to support more efficient implementations of ANNs, rather than merely emulating them on traditional systems. Photonics stands out as a particularly promising platform, providing scalability, high speed, energy efficiency, and the ability for parallel information processing. However, fully realized autonomous optical neural networks (ONNs) with in-situ learning capabilities are still rare. In this work, we demonstrate a fully autonomous and parallel ONN using a multimode vertical cavity surface emitting laser (VCSEL) using off-the-shelf components. Our ONN is highly efficient and is scalable both in network size and inference bandwidth towards the GHz range. High performance hardware-compatible optimization algorithms are necessary in order to minimize reliance on external von Neumann computers to fully exploit the potential of ONNs. As such we present and extensively study several algorithms which are broadly compatible with a wide range of systems. We then apply these algorithms to optimize our ONN, and benchmark them using the MNIST dataset. We show that our ONN can achieve high accuracy and convergence efficiency, even under limited hardware resources. Crucially, we compare these different algorithms in terms of scaling and optimization efficiency in term of convergence time which is crucial when working with limited external resources. Our work provides some guidance for the design of future ONNs as well as a simple and flexible way to train them.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2503.16943

Country:

Europe > Germany (0.28)
Asia > Japan (0.28)
Europe > France (0.27)

Genre: Research Report (1.00)

Industry:

Energy > Oil & Gas (0.92)
Health & Medicine (0.66)
Education > Educational Setting > Online (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

The Effects of Multi-Task Learning on ReLU Neural Network Functions

Nakhleh, Julia, Shenouda, Joseph, Nowak, Robert D.

arXiv.org Machine LearningDec-11-2024

This paper studies the properties of solutions to multi-task shallow ReLU neural network learning problems, wherein the network is trained to fit a dataset with minimal sum of squared weights. Remarkably, the solutions learned for each individual task resemble those obtained by solving a kernel regression problem, revealing a novel connection between neural networks and kernel methods. It is known that single-task neural network learning problems are equivalent to a minimum norm interpolation problem in a non-Hilbertian Banach space, and that the solutions of such problems are generally non-unique. In contrast, we prove that the solutions to univariate-input, multi-task neural network interpolation problems are almost always unique, and coincide with the solution to a minimum-norm interpolation problem in a Sobolev (Reproducing Kernel) Hilbert Space. We also demonstrate a similar phenomenon in the multivariate-input case; specifically, we show that neural network learning problems with large numbers of tasks are approximately equivalent to an $\ell^2$ (Hilbert space) minimization problem over a fixed kernel determined by the optimal neurons.

knot, neural network, neuron, (17 more...)

arXiv.org Machine Learning

2410.21696

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > District of Columbia > Washington (0.04)
Europe > Sweden (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Education > Focused Education > Special Education (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Recurrent Stochastic Configuration Networks with Hybrid Regularization for Nonlinear Dynamics Modelling

Dang, Gang, Wang, Dianhui

arXiv.org Machine LearningNov-25-2024

Recurrent stochastic configuration networks (RSCNs) have shown great potential in modelling nonlinear dynamic systems with uncertainties. This paper presents an RSCN with hybrid regularization to enhance both the learning capacity and generalization performance of the network. Given a set of temporal data, the well-known least absolute shrinkage and selection operator (LASSO) is employed to identify the significant order variables. Subsequently, an improved RSCN with L2 regularization is introduced to approximate the residuals between the output of the target plant and the LASSO model. The output weights are updated in real-time through a projection algorithm, facilitating a rapid response to dynamic changes within the system. A theoretical analysis of the universal approximation property is provided, contributing to the understanding of the network's effectiveness in representing various complex nonlinear functions. Experimental results from a nonlinear system identification problem and two industrial predictive tasks demonstrate that the proposed method outperforms other models across all testing datasets.

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Machine Learning

2412.0007

Country:

Asia > China > Liaoning Province > Shenyang (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
Asia > China > Jiangsu Province > Xuzhou (0.04)

Genre: Research Report (0.64)

Industry: Energy (0.94)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.71)

Add feedback

Deeper Insights into Learning Performance of Stochastic Configuration Networks

Yan, Xiufeng, Wang, Dianhui

arXiv.org Artificial IntelligenceNov-13-2024

Stochastic Configuration Networks (SCNs) are a class of randomized neural networks that integrate randomized algorithms within an incremental learning framework. A defining feature of SCNs is the supervisory mechanism, which adaptively adjusts the distribution to generate effective random basis functions, thereby enabling error-free learning. In this paper, we present a comprehensive analysis of the impact of the supervisory mechanism on the learning performance of SCNs. Our findings reveal that the current SCN framework evaluates the effectiveness of each random basis function in reducing residual errors using a lower bound on its error reduction potential, which constrains SCNs' overall learning efficiency. Specifically, SCNs may fail to consistently select the most effective random candidate as the new basis function during each training iteration. To overcome this problem, we propose a novel method for evaluating the hidden layer's output matrix, supported by a new supervisory mechanism that accurately assesses the error reduction potential of random basis functions without requiring the computation of the Moore-Penrose inverse of the output matrix. This approach enhances the selection of basis functions, reducing computational complexity and improving the overall scalability and learning capabilities of SCNs. We introduce a Recursive Moore-Penrose Inverse-SCN (RMPI-SCN) training scheme based on the new supervisory mechanism and demonstrate its effectiveness through simulations over some benchmark datasets. Experiments show that RMPI-SCN outperforms the conventional SCN in terms of learning capability, underscoring its potential to advance the SCN framework for large-scale data modeling applications.

basis function, rmpi-scn, scn-iii, (13 more...)

arXiv.org Artificial Intelligence

2411.08544

Country:

Asia > China > Jiangsu Province > Xuzhou (0.04)
Asia > China > Liaoning Province > Shenyang (0.04)

Genre:

Research Report > Promising Solution (0.34)
Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Deep Recurrent Stochastic Configuration Networks for Modelling Nonlinear Dynamic Systems

Dang, Gang, Wang, Dianhui

arXiv.org Machine LearningOct-28-2024

Deep learning techniques have shown promise in many domain applications. This paper proposes a novel deep reservoir computing framework, termed deep recurrent stochastic configuration network (DeepRSCN) for modelling nonlinear dynamic systems. DeepRSCNs are incrementally constructed, with all reservoir nodes directly linked to the final output. The random parameters are assigned in the light of a supervisory mechanism, ensuring the universal approximation property of the built model. The output weights are updated online using the projection algorithm to handle the unknown dynamics. Given a set of training samples, DeepRSCNs can quickly generate learning representations, which consist of random basis functions with cascaded input and readout weights. Experimental results over a time series prediction, a nonlinear system identification problem, and two industrial data predictive analyses demonstrate that the proposed DeepRSCN outperforms the single-layer network in terms of modelling efficiency, learning capability, and generalization performance.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Machine Learning

2410.20904

Country:

Asia > China (0.28)
South America > Brazil (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Energy > Oil & Gas (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Implicit Bias of Mirror Descent for Shallow Neural Networks in Univariate Regression

Liang, Shuang, Montúfar, Guido

arXiv.org Machine LearningOct-4-2024

We examine the implicit bias of mirror flow in univariate least squares error regression with wide and shallow neural networks. For a broad class of potential functions, we show that mirror flow exhibits lazy training and has the same implicit bias as ordinary gradient flow when the network width tends to infinity. For ReLU networks, we characterize this bias through a variational problem in function space. Our analysis includes prior results for ordinary gradient flow as a special case and lifts limitations which required either an intractable adjustment of the training data or networks with skip connections. We further introduce scaled potentials and show that for these, mirror flow still exhibits lazy training but is not in the kernel regime. For networks with absolute value activations, we show that mirror flow with scaled potentials induces a rich class of biases, which generally cannot be captured by an RKHS norm. A takeaway is that whereas the parameter initialization determines how strongly the curvature of the learned function is penalized at different locations of the input space, the scaled potential determines how the different magnitudes of the curvature are penalized.

implicit bias, mirror descent, shallow neural network, (12 more...)

arXiv.org Machine Learning

2410.03988

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Task structure and nonlinearity jointly determine learned representational geometry

Alleman, Matteo, Lindsey, Jack W, Fusi, Stefano

arXiv.org Artificial IntelligenceJan-24-2024

The utility of a learned neural representation depends on how well its geometry supports performance in downstream tasks. This geometry depends on the structure of the inputs, the structure of the target outputs, and the architecture of the network. By studying the learning dynamics of networks with one hidden layer, we discovered that the network's activation function has an unexpectedly strong impact on the representational geometry: Tanh networks tend to learn representations that reflect the structure of the target outputs, while ReLU networks retain more information about the structure of the raw inputs. This difference is consistently observed across a broad class of parameterized tasks in which we modulated the degree of alignment between the geometry of the task inputs and that of the task labels. We analyzed the learning dynamics in weight space and show how the differences between the networks with Tanh and ReLU nonlinearities arise from the asymmetric asymptotic behavior of ReLU, which leads feature neurons to specialize for different regions of input space. By contrast, feature neurons in Tanh networks tend to inherit the task label structure. Consequently, when the target outputs are low dimensional, Tanh networks generate neural representations that are more disentangled than those obtained with a ReLU nonlinearity. Our findings shed light on the interplay between input-output geometry, nonlinearity, and learned representations in neural networks.

alignment, geometry, representation, (16 more...)

arXiv.org Artificial Intelligence

2401.13558

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback